This report provides a detailed workflow of the project on Homestead Tax Exemption entitlement assistance outreach, for the City of Philadelphia Office of Philly Stat 360 and Office of Information Technology. The aim of the project is to design an algorithm-driven outreach campaign that can cost-effectively identify homeowners who are likely to be eligible for the Homestead Tax Exemption but are not participating in the program. The project aims to allow our clients to understand where these properties are located, potential outreach strategies, and the associated costs and benefits.
Properties identified as most likely eligible for the Homestead Exemption but not enrolled in the program are also thought to be more likely to be subject to “tangled titles” or to family-rental arrangements that require an affidavit to waive the need for a rental license.
Property tax in Philadelphia is 1.3998% of the property value, as assessed by the Office of Property Assessment, for the 2025 tax year. This is made up of 0.6159% (City of Philadelphia) and 0.7839% (School District). The taxes are due on March 31st each year.
The Homestead Exemption reduces the taxable portion of a homeowner’s property assessment by up to $100,000, saving up to $1,399 on real estate taxes annually. The signed bill aimed to lessen the financial burden of new property assessments on Philadelphia homeowners, whose property values increased by an average of 31% after the city delayed the annual calculations for three years due to the pandemic. Eligibility for the Homestead Exemption is as follows:
• You must own the property and use it as your primary residence.
• There are no age or income restrictions.
• The property must not be used exclusively for business purposes or as rental units (using a percentage of the property for these purposes is fine).
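The maximum savings figure follows directly from the tax rate and the exemption cap. The sketch below reproduces that arithmetic in base R; `homestead_savings` is a hypothetical helper, and capping the reduction at the assessed value is an assumption made for illustration, not a rule taken from the program documentation.

```r
# Tax rates for the 2025 tax year, as stated in this report (percent of assessed value)
city_rate   <- 0.6159
school_rate <- 0.7839
total_rate  <- city_rate + school_rate   # combined rate, in percent

# Maximum reduction in taxable assessment under the Homestead Exemption
max_exemption <- 100000

# Hypothetical helper: annual savings for a given assessed value.
# Assumption: the reduction cannot exceed the assessed value itself.
homestead_savings <- function(assessed_value, exemption = max_exemption) {
  reduction <- pmin(assessed_value, exemption)
  reduction * total_rate / 100
}

homestead_savings(250000)  # home assessed above the cap: full $100,000 reduction
homestead_savings(60000)   # home assessed below the cap: smaller reduction
```

For a home assessed above the cap, the result is $100,000 × 1.3998%, which matches the "up to $1,399" figure quoted in the text.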
A homeowner is ineligible if already enrolled in either of these alternative real estate tax relief or abatement programs:
• Longtime Owner Occupants Program (LOOP), an income-based program for homeowners who experience a substantial increase in their property assessment.
• The 10-year residential tax abatement program; a homeowner can apply for the Homestead Exemption only after the abatement has ended.
Programs that can be used in conjunction with the Homestead Exemption include:
• Owner-Occupied Real Estate Tax Payment Agreement (OOPA)
• Senior Citizen Real Estate Tax Freeze
• Low-Income Real Estate Tax Freeze
• Real Estate Tax Installment Plan
• Tax Credits for Active-Duty Reserve and National Guard Members
One issue that may prevent a long-term resident from claiming the Homestead Exemption is a tangled title, which occurs when a long-term resident effectively functions as a homeowner but lacks legal ownership of the property. This often happens when a family member who owned the property passes away and the legal processes needed to formalize the ownership transfer were never completed, leaving the resident ineligible for the exemption. However, Philadelphia offers a conditional Homestead Exemption of three years for such cases while the legal transfer of ownership is resolved.
Currently, no focused or strategic efforts are being carried out to identify and reach homeowners who are not enrolled in the Homestead Exemption. Accurate identification of eligible homeowners makes cost-effective, efficient, targeted outreach possible, so that these homeowners learn about the program and receive support in keeping their homes.
The primary dataset used is the Property and Assessment History, publicly available for download on OpenDataPhilly. Six relevant datasets are merged with this primary dataset on common identifying keys, such as the parcel number, in order to include useful predictor variables in the model that predicts which homeowners are most likely eligible for, but not currently enrolled in, the Homestead Exemption.
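Merging on a shared key is sensitive to how the key is stored in each source. The following is a minimal sketch in base R, using invented toy data, of a left join on the parcel number where the key types differ. The toy data frames and their values are assumptions for illustration only.

```r
# Toy stand-ins for the primary dataset and one supplementary dataset (values invented)
properties_toy <- data.frame(
  parcel_number = c(11001, 11002, 11003),   # stored as numeric here
  market_value  = c(180000, 240000, 95000)
)
licenses_toy <- data.frame(
  opa_account_num = c("11002", "11003"),    # stored as character here
  rental_license  = c(1, 1),
  stringsAsFactors = FALSE
)

# Coerce the key to a common type before joining
properties_toy$parcel_number <- as.character(properties_toy$parcel_number)

# Left join: keep every property, attach the flag where a match exists
merged <- merge(properties_toy, licenses_toy,
                by.x = "parcel_number", by.y = "opa_account_num",
                all.x = TRUE)

# Properties with no match get 0 rather than NA
merged$rental_license[is.na(merged$rental_license)] <- 0
merged
```

If a key can carry leading zeros, reading it as character from the start is the safer choice, since converting a numeric key to character cannot restore zeros that were dropped on import.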
Every observation in the Property and Assessment History dataset is one property in Philadelphia, with a total of 584,049 properties and 79 features. Because this dataset is updated daily, the snapshot used for this project is current as of 31 January 2025.
The dataset contains a column ‘homestead_exemption’ that records the amount removed from the taxable portion of the property assessment. Note that 14 properties had a homestead exemption larger than $100,000, the maximum possible amount; these are suspected to be clerical errors and have been flagged to the PhillyStat360 team. The dependent variable for the model is derived from this feature as a binary variable indicating whether the property is currently enrolled in the Homestead Exemption program, that is, whether the value is non-zero. There are 246,853 properties with a homestead exemption.
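As a small illustration of the two steps described above, the sketch below derives the binary target and flags amounts above the cap on invented toy data. The `over_cap` column name is hypothetical, and the target is built with `> 0`, which matches the non-zero rule only under the assumption that exemption amounts are never negative.

```r
# Toy exemption amounts (invented values); 120000 exceeds the $100,000 cap
toy <- data.frame(
  parcel_number       = c("A", "B", "C", "D"),
  homestead_exemption = c(0, 80000, 100000, 120000),
  stringsAsFactors    = FALSE
)

cap <- 100000

# Binary target: 1 if any exemption amount is recorded, 0 otherwise
toy$exemption <- as.integer(toy$homestead_exemption > 0)

# Flag records above the cap for manual review rather than dropping them
toy$over_cap <- toy$homestead_exemption > cap

table(toy$exemption)
toy[toy$over_cap, ]
```

Flagging rather than filtering keeps the suspect records in the modelling data, since they are still enrolled properties, while making them easy to report.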
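# Packages used throughout the analysis
library(data.table)  # fread
library(dplyr)
library(tidyr)
library(stringr)
library(sf)
library(leaflet)
library(ggplot2)
library(knitr)       # kable
properties <- fread("Data/opa_properties_public.csv")
# Binary target: 1 if the property has a non-zero homestead exemption
filtered_properties <- properties %>%
  mutate(exemption = ifelse(homestead_exemption == 0, 0, 1))
# Flag properties whose zoning code allows residential use
filtered_properties <- filtered_properties %>%
  mutate(is_residential = ifelse(zoning %in% c(
    "RM1", "RM2", "RM3", "RM4",
    "RSA1", "RSA2", "RSA3", "RSA4", "RSA5", "RSA6",
    "RSD1", "RSD2", "RSD3",
    "RM1|RSA5", "RSD1|RSD3", "RSA5|RSA5",
    "RTA1",
    "CMX1", "CMX2", "CMX2.5", "CMX3", "CMX4", "CMX5", "IRMX"), 1, 0))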
# Look into exemption status by zoning code
exemptionbyzoning <- filtered_properties %>%
group_by(zoning, exemption) %>%
summarise(count = n(), .groups = "drop") %>%
tidyr::pivot_wider(names_from = exemption, values_from = count, values_fill = list(count = 0)) %>%
rename(No_Exemption = `0`, Exemption = `1`)
# Look into number of blank / NA zoning codes
num_blank_zoning <- properties %>%
filter(zoning == "" | is.na(zoning) | str_trim(zoning) == "") %>%
nrow()
print(num_blank_zoning)
## [1] 2464
residential_properties <- filtered_properties %>% filter(is_residential == 1)
properties_sf <- st_as_sf(residential_properties, wkt = "shape", crs = 2272)
census_tracts <- st_read("Data/phila_census1.gpkg")
## Reading layer `phila_census' from data source
## `C:\Users\14735\WPSDrive\376583023\WPS云盘\0_MCP\25spring\Smart Cities Practicum\Philly-Homeowners\Data\phila_census1.gpkg'
## using driver `GPKG'
## Simple feature collection with 408 features and 31 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: -75.28027 ymin: 39.867 xmax: -74.95576 ymax: 40.13799
## Geodetic CRS: NAD83
census_tracts <- st_transform(census_tracts, 2272)
properties_tract <- st_join(properties_sf, census_tracts)
# Create tract summary
tract_summary <- properties_tract %>%
group_by(GEOID) %>%
summarise(
total_properties = n(),
homestead_count = sum(homestead_exemption > 0, na.rm = TRUE),
pct_homestead = (homestead_count / total_properties) * 100,
.groups = "drop"
) %>%
st_drop_geometry()
# Create final enriched dataset
census_tracts_enriched <- census_tracts %>%
left_join(tract_summary, by = "GEOID")
One major issue is the large number of NA values. Even for features that show a low NA count in the table below, further investigation reveals many empty cells.
#Number of NA values
print(residential_properties %>%
summarise(across(everything(), ~sum(is.na(.)))) %>%
tidyr::pivot_longer(cols = everything(), names_to = "Column", values_to = "NA_Count"), n = 100)
## # A tibble: 81 × 2
## Column NA_Count
## <chr> <int>
## 1 objectid 0
## 2 assessment_date 13
## 3 basements 0
## 4 beginning_point 0
## 5 book_and_page 0
## 6 building_code 0
## 7 building_code_description 0
## 8 category_code 0
## 9 category_code_description 0
## 10 census_tract 12
## 11 central_air 0
## 12 cross_reference 564194
## 13 date_exterior_condition 564194
## 14 depth 3464
## 15 exempt_building 13
## 16 exempt_land 13
## 17 exterior_condition 0
## 18 fireplaces 76738
## 19 frontage 3488
## 20 fuel 0
## 21 garage_spaces 78050
## 22 garage_type 530843
## 23 general_construction 0
## 24 geographic_ward 0
## 25 homestead_exemption 0
## 26 house_extension 0
## 27 house_number 0
## 28 interior_condition 0
## 29 location 0
## 30 mailing_address_1 0
## 31 mailing_address_2 564194
## 32 mailing_care_of 0
## 33 mailing_city_state 0
## 34 mailing_street 0
## 35 mailing_zip 0
## 36 market_value 13
## 37 market_value_date 564194
## 38 number_of_bathrooms 76387
## 39 number_of_bedrooms 72099
## 40 number_of_rooms 551851
## 41 number_stories 64549
## 42 off_street_open 6749
## 43 other_building 0
## 44 owner_1 0
## 45 owner_2 0
## 46 parcel_number 0
## 47 parcel_shape 0
## 48 quality_grade 0
## 49 recording_date 3495
## 50 registry_number 0
## 51 sale_date 2163
## 52 sale_price 2187
## 53 separate_utilities 0
## 54 sewer 0
## 55 site_type 564194
## 56 state_code 2
## 57 street_code 12
## 58 street_designation 0
## 59 street_direction 0
## 60 street_name 0
## 61 suffix 0
## 62 taxable_building 13
## 63 taxable_land 13
## 64 topography 0
## 65 total_area 603
## 66 total_livable_area 39429
## 67 type_heater 0
## 68 unfinished 564194
## 69 unit 0
## 70 utility 564194
## 71 view_type 0
## 72 year_built 39426
## 73 year_built_estimate 0
## 74 zip_code 0
## 75 zoning 0
## 76 pin 0
## 77 building_code_new 0
## 78 building_code_description_new 0
## 79 shape 0
## 80 exemption 0
## 81 is_residential 0
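Because `is.na()` does not treat empty strings as missing, the NA counts above understate how much data is absent. A minimal sketch of a complementary check in base R is given below, on invented toy data; `count_missing` is a hypothetical helper, not part of the original analysis.

```r
# Toy data (invented): blank strings are not NA, so is.na() misses them
toy <- data.frame(
  owner_2     = c("", "JANE DOE", " ", NA),
  garage_type = c("A", "", "", "B"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count NA and blank (empty or whitespace-only) cells per column
count_missing <- function(df) {
  data.frame(
    Column      = names(df),
    NA_Count    = vapply(df, function(x) sum(is.na(x)), integer(1)),
    Blank_Count = vapply(df, function(x) {
      if (is.character(x)) sum(!is.na(x) & trimws(x) == "") else 0L
    }, integer(1)),
    row.names = NULL
  )
}

count_missing(toy)
```

Reporting the two counts side by side makes it clear which columns look complete only because their missing values are stored as blanks.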
# Transform data to WGS84 (required for leaflet)
census_tracts_wgs84 <- st_transform(census_tracts_enriched, 4326)
# Create interactive map
leaflet(census_tracts_wgs84) %>%
addTiles() %>% # Add OpenStreetMap base map
addPolygons(
fillColor = ~colorNumeric(
palette = "viridis",
domain = c(0, 100)
)(pct_homestead),
fillOpacity = 0.7,
weight = 1,
color = "white",
popup = ~paste(
"Census Tract:", GEOID, "<br>",
"Homestead %:", round(pct_homestead, 1), "<br>",
"Population Density:", round(pop_density, 0), "<br>",
"Total Properties:", total_properties, "<br>",
"Median Income:", scales::dollar(median_income)
)
) %>%
addLegend(
position = "bottomright",
pal = colorNumeric("viridis", domain = c(0, 100)),
values = ~pct_homestead,
title = "% Homestead Exemption",
opacity = 0.7
)
census_tracts_filtered <- census_tracts_enriched %>%
filter(pop_density > 0 & total_properties >= 30) # 30 properties minimum
census_tracts_invalid <- census_tracts_enriched %>%
filter(pop_density == 0 | total_properties < 30) # Include low property counts in "invalid"
# Transform both to WGS84
census_tracts_invalid_wgs84 <- st_transform(census_tracts_invalid, 4326)
census_tracts_filtered_wgs84 <- st_transform(census_tracts_filtered, 4326)
# Create map with both layers
leaflet() %>%
addTiles() %>%
# Add invalid tracts first (in gray)
addPolygons(data = census_tracts_invalid_wgs84,
fillColor = "gray",
fillOpacity = 0.5,
weight = 1,
color = "white",
popup = "No data available"
) %>%
# Add valid tracts, colored by homestead exemption rate
addPolygons(data = census_tracts_filtered_wgs84,
fillColor = ~colorNumeric(
palette = "viridis",
domain = c(0, 85)
)(pct_homestead),
fillOpacity = 0.7,
weight = 1,
color = "white",
popup = ~paste(
"Census Tract:", GEOID, "<br>",
"Homestead %:", round(pct_homestead, 1), "%<br>",
"Total Properties:", total_properties,
ifelse(total_properties < 100,
"<br><i style='color:red'>Note: Low property count may affect reliability</i>",
"")
)
)%>%
addLegend(
position = "bottomright",
pal = colorNumeric("viridis", domain = c(0, 85)),
values = census_tracts_filtered_wgs84$pct_homestead,
title = "% Homestead Exemption",
opacity = 0.7,
labFormat = labelFormat(suffix = "%")
) %>%
# Add legend for gray areas
addLegend(
position = "bottomright",
colors = "gray",
labels = "No Data Available",
opacity = 0.5
)
# Create histogram of homestead exemption distribution
census_hist <- ggplot(census_tracts_enriched %>%
filter(pop_density > 0), # Filter out zero population tracts
aes(x = pct_homestead)) +
geom_histogram(
binwidth = 5,
fill = "#008d8a",
color = "white"
) +
labs(
title = "Distribution of Homestead Exemption Rates Across Philadelphia Census Tracts",
subtitle = "Excluding Zero Population Density Tracts",
x = "Percentage of Properties with Homestead Exemption",
y = "Number of Census Tracts"
) +
theme_minimal() +
theme(
plot.title = element_text(size = 18, face = "bold"),
plot.subtitle = element_text(size = 14),
axis.title = element_text(size = 14),
axis.text = element_text(size = 14),
legend.text = element_text(size = 14)
) +
scale_x_continuous(breaks = seq(0, 100, by = 10))
census_hist
The distribution is roughly normal, with most tracts clustered between 30% and 50%. The number of tracts drops off notably below 30%, and relatively few tracts have rates below 20%.
Therefore, census tracts with homestead exemption rates below 30% could be considered to have low enrollment and might warrant targeted outreach or investigation into barriers to participation, assuming they are primarily residential areas and not institutional or special-use tracts.
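# Mask zero-density tracts so that they render in gray, as the caption states
homestead_pattern <- ggplot(census_tracts_enriched %>%
    mutate(pct_homestead = ifelse(pop_density > 0, pct_homestead, NA_real_))) +
  geom_sf(aes(fill = cut(pct_homestead,
                         breaks = c(0, 20, 30, 40, 50, 60, 100),
                         labels = c("<20%", "20-30%", "30-40%", "40-50%", "50-60%", ">60%"),
                         include.lowest = TRUE))) +  # keep tracts at exactly 0% in the lowest bin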
scale_fill_viridis_d(
name = "Homestead\nExemption Rate",
na.value = "gray80",
guide = guide_legend(reverse = TRUE)
) +
labs(
title = "Homestead Exemption Rates Across Philadelphia",
subtitle = "By Census Tract (Excluding Zero Population Areas)",
caption = "Gray areas indicate zero population density tracts"
) +
theme_minimal() +
theme(
panel.grid = element_blank(),
plot.title = element_text(size = 18, face = "bold"),
plot.subtitle = element_text(size = 14),
axis.title = element_text(size = 14),
axis.text = element_blank(),
legend.text = element_text(size = 14)
)
homestead_pattern
#ggsave("outputs/homestead-exemption-pattern.png", homestead_pattern, width = 10, height = 6)
# Basic summary statistics
summary(census_tracts_enriched$pop_density)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0 10522 18246 19616 27127 92575
# More detailed statistics
census_tracts_enriched %>%
summarise(
mean_density = mean(pop_density, na.rm = TRUE),
median_density = median(pop_density, na.rm = TRUE),
q1 = quantile(pop_density, 0.25, na.rm = TRUE),
q3 = quantile(pop_density, 0.75, na.rm = TRUE)
)
## Simple feature collection with 1 feature and 4 fields
## Geometry type: POLYGON
## Dimension: XY
## Bounding box: xmin: 2660586 ymin: 204650.6 xmax: 2750109 ymax: 304965.3
## Projected CRS: NAD83 / Pennsylvania South (ftUS)
## mean_density median_density q1 q3 geom
## 1 19615.79 18246.11 10521.61 27126.94 POLYGON ((2679429 207838.9,...
# Visual distribution
ggplot(census_tracts_enriched, aes(x = pop_density)) +
geom_histogram(binwidth = 1000) +
theme_minimal() +
labs(title = "Distribution of Population Density in Philadelphia Census Tracts",
x = "Population Density (per square mile)",
y = "Count of Census Tracts")
homestead_pattern <- ggplot(census_tracts_enriched %>%
mutate(pct_homestead = case_when(
pop_density < 2000 | total_properties < 100 ~ NA_real_,
TRUE ~ pct_homestead))) +
  geom_sf(aes(fill = cut(pct_homestead,
                         breaks = c(0, 20, 30, 40, 50, 60, 100),
                         labels = c("<20%", "20-30%", "30-40%", "40-50%", "50-60%", ">60%"),
                         include.lowest = TRUE))) +  # keep tracts at exactly 0% in the lowest bin
scale_fill_viridis_d(
name = "Homestead\nExemption Rate",
na.value = "gray80",
guide = guide_legend(reverse = TRUE)
) +
labs(
title = "Homestead Exemption Rates Across Philadelphia",
subtitle = "By Census Tract",
caption = "Gray tracts indicate population density < 2,000 per sq. mile or fewer than 100 properties"
) +
theme_minimal() +
theme(
panel.grid = element_blank(),
plot.title = element_text(size = 18, face = "bold"),
plot.subtitle = element_text(size = 14),
axis.title = element_text(size = 14),
axis.text = element_blank(),
legend.text = element_text(size = 14)
)
homestead_pattern
# Distribution of pct_homestead
summary(tract_summary$pct_homestead)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 28.08 41.50 41.73 54.92 100.00
# Map for low rates (under 30%)
ggplot(census_tracts_enriched) +
geom_sf(aes(fill = ifelse(pop_density > 0 & pct_homestead < 30,
pct_homestead, NA))) +
scale_fill_viridis_c(
name = "% Homestead\nExemption\n(Under 30%)",
na.value = "gray80",
limits = c(0, 30),
breaks = seq(0, 30, by = 5)
) +
labs(
title = "Low Homestead Exemption Rates in Philadelphia",
subtitle = "Census Tracts Below 30% Enrollment",
caption = "Gray areas: Zero population density or rates ≥ 30%"
) +
map_theme
# Table for low rates
low_enrollment_tracts <- census_tracts_enriched %>%
filter(pop_density > 0 & pct_homestead < 30) %>%
select(GEOID, pct_homestead, pop_density) %>%
arrange(pct_homestead)
low_enrollment_tracts %>%
st_drop_geometry() %>%
arrange(pct_homestead) %>%
kable(
col.names = c("Census Tract", "% Homestead", "Population Density"),
digits = 2,
caption = "Census Tracts with Low Homestead Exemption Rates (<30%)"
)
| Census Tract | % Homestead | Population Density |
|---|---|---|
| 42101008801 | 0.00 | 31288.99 |
| 42101012201 | 0.00 | 20308.08 |
| 42101012203 | 0.00 | 12216.50 |
| 42101036901 | 0.00 | 66.50 |
| 42101036902 | 0.00 | 18371.62 |
| 42101989300 | 0.00 | 145.65 |
| 42101012501 | 0.65 | 19593.83 |
| 42101000600 | 2.15 | 23041.85 |
| 42101008802 | 3.82 | 40358.85 |
| 42101000500 | 5.37 | 18466.26 |
| 42101015300 | 8.35 | 27884.65 |
| 42101015600 | 9.65 | 10032.78 |
| 42101014700 | 11.48 | 25395.42 |
| 42101014000 | 12.15 | 33879.19 |
| 42101016200 | 12.65 | 16491.54 |
| 42101000702 | 13.26 | 54828.50 |
| 42101009000 | 14.26 | 36339.49 |
| 42101016500 | 14.26 | 18703.31 |
| 42101000404 | 14.69 | 49405.57 |
| 42101010600 | 15.91 | 17551.67 |
| 42101010800 | 16.28 | 20039.58 |
| 42101016300 | 16.34 | 11835.92 |
| 42101014500 | 16.41 | 17870.04 |
| 42101010900 | 16.48 | 24584.09 |
| 42101016600 | 16.53 | 22256.69 |
| 42101016400 | 16.91 | 21458.61 |
| 42101014800 | 17.14 | 12169.05 |
| 42101000901 | 17.73 | 53169.00 |
| 42101008702 | 18.11 | 25905.00 |
| 42101017701 | 18.38 | 34113.71 |
| 42101015200 | 18.43 | 23128.35 |
| 42101037700 | 18.46 | 18800.68 |
| 42101013300 | 18.56 | 25325.92 |
| 42101016702 | 18.56 | 18781.08 |
| 42101013100 | 18.67 | 15041.99 |
| 42101037600 | 19.03 | 12327.66 |
| 42101017800 | 19.41 | 25766.40 |
| 42101020000 | 19.65 | 9370.00 |
| 42101000101 | 19.76 | 14330.64 |
| 42101014400 | 19.84 | 22719.23 |
| 42101017601 | 19.97 | 22499.74 |
| 42101013900 | 20.53 | 11893.19 |
| 42101000701 | 20.93 | 33310.62 |
| 42101020101 | 21.32 | 17117.37 |
| 42101009100 | 21.42 | 18026.35 |
| 42101029400 | 21.46 | 15360.37 |
| 42101017400 | 21.82 | 17969.72 |
| 42101010700 | 21.97 | 18285.44 |
| 42101013200 | 22.02 | 17949.73 |
| 42101006600 | 22.24 | 13505.86 |
| 42101014100 | 22.31 | 13739.18 |
| 42101013800 | 22.33 | 15763.50 |
| 42101006300 | 22.40 | 20581.10 |
| 42101000805 | 22.57 | 92575.12 |
| 42101015101 | 22.71 | 28989.00 |
| 42101011000 | 22.72 | 13900.65 |
| 42101029300 | 22.83 | 13460.26 |
| 42101014202 | 22.95 | 11438.46 |
| 42101000401 | 23.06 | 31951.13 |
| 42101024100 | 23.32 | 8896.72 |
| 42101017702 | 23.46 | 25018.11 |
| 42101036700 | 23.72 | 12813.19 |
| 42101020300 | 23.89 | 13067.52 |
| 42101016701 | 24.19 | 35668.68 |
| 42101000102 | 24.37 | 16883.65 |
| 42101018801 | 24.43 | 23895.47 |
| 42101002000 | 24.57 | 18106.87 |
| 42101000200 | 24.59 | 23376.36 |
| 42101016100 | 24.71 | 22414.27 |
| 42101015102 | 24.77 | 22941.91 |
| 42101003300 | 24.80 | 15304.04 |
| 42101016901 | 25.09 | 16946.58 |
| 42101014900 | 25.15 | 24675.81 |
| 42101038100 | 25.29 | 608.11 |
| 42101003100 | 25.36 | 32820.35 |
| 42101017500 | 25.97 | 24161.65 |
| 42101024600 | 26.04 | 10639.09 |
| 42101008701 | 26.16 | 33669.26 |
| 42101003200 | 26.49 | 22098.03 |
| 42101016902 | 26.53 | 18034.08 |
| 42101013701 | 26.66 | 17167.16 |
| 42101011100 | 26.72 | 8056.80 |
| 42101037800 | 26.74 | 1524.12 |
| 42101016800 | 26.94 | 17662.29 |
| 42101009200 | 27.30 | 17421.38 |
| 42101024500 | 27.51 | 15856.83 |
| 42101008602 | 27.52 | 22227.50 |
| 42101013702 | 27.81 | 33356.17 |
| 42101014300 | 27.83 | 8977.12 |
| 42101029900 | 27.91 | 19596.37 |
| 42101010500 | 28.00 | 16738.99 |
| 42101017900 | 28.16 | 24208.45 |
| 42101018802 | 28.32 | 43792.22 |
| 42101030000 | 28.44 | 23583.79 |
| 42101009400 | 28.55 | 27290.24 |
| 42101017602 | 28.60 | 26348.66 |
| 42101000902 | 28.80 | 45668.26 |
| 42101004101 | 28.82 | 34632.43 |
| 42101014201 | 28.96 | 29354.24 |
| 42101015700 | 29.17 | 13311.10 |
| 42101007700 | 29.41 | 14586.80 |
| 42101019200 | 29.49 | 34549.40 |
| 42101010300 | 29.64 | 24235.53 |
| 42101013500 | 29.68 | 26085.90 |
| 42101009500 | 29.71 | 30478.18 |
# Map for high rates (over 60%)
ggplot(census_tracts_enriched) +
geom_sf(aes(fill = ifelse(pop_density > 0 & pct_homestead > 60,
pct_homestead, NA))) +
scale_fill_viridis_c(
name = "% Homestead\nExemption\n(Over 60%)",
na.value = "gray80",
limits = c(60, 82),
breaks = seq(60, 80, by = 5)
) +
labs(
title = "High Homestead Exemption Rates in Philadelphia",
subtitle = "Census Tracts Above 60% Enrollment",
caption = "Gray areas: Zero population density or rates ≤ 60%"
) +
map_theme
# Table for high rates
high_enrollment_tracts <- census_tracts_enriched %>%
  filter(pop_density > 0 & pct_homestead > 60) %>%
  select(GEOID, pct_homestead, pop_density) %>%
  arrange(desc(pct_homestead))
high_enrollment_tracts %>%
  st_drop_geometry() %>%
  kable(
    col.names = c("Census Tract", "% Homestead", "Population Density"),
    digits = 2,
    caption = "Census Tracts with High Homestead Exemption Rates (>60%)"
  )
# Distribution of pct_homestead with new criteria
# (pop_density is a column of census_tracts_enriched, not of tract_summary)
with(st_drop_geometry(census_tracts_enriched),
     summary(pct_homestead[which(pop_density >= 2000 & total_properties >= 100)]))
# Map for low rates (under 30%)
ggplot(census_tracts_enriched) +
geom_sf(aes(fill = ifelse(pop_density >= 2000 & total_properties >= 100 & pct_homestead < 30,
pct_homestead, NA))) +
scale_fill_viridis_c(
name = "% Homestead\nExemption\n(Under 30%)",
na.value = "gray80",
limits = c(0, 30),
breaks = seq(0, 30, by = 5)
) +
labs(
title = "Low Homestead Exemption Rates in Philadelphia",
subtitle = "Census Tracts Below 30% Enrollment",
caption = "Gray areas: Low density (<2000/sq mi), low property count (<100), or rates ≥ 30%"
) +
map_theme
# Map for high rates (over 60%)
ggplot(census_tracts_enriched) +
geom_sf(aes(fill = ifelse(pop_density >= 2000 & total_properties >= 100 & pct_homestead > 60,
pct_homestead, NA))) +
scale_fill_viridis_c(
name = "% Homestead\nExemption\n(Over 60%)",
na.value = "gray80",
limits = c(60, 82),
breaks = seq(60, 80, by = 5)
) +
labs(
title = "High Homestead Exemption Rates in Philadelphia",
subtitle = "Census Tracts Above 60% Enrollment",
caption = "Gray areas: Low density (<2000/sq mi), low property count (<100), or rates ≤ 60%"
) +
map_theme
# Share of occupied units that are owner-occupied, in percent
census_tracts_enriched <- census_tracts_enriched %>%
  mutate(owner_occ_rate = (owner_hh / occupied_units) * 100)
low_enrollment_tracts <- census_tracts_enriched %>%
filter(pop_density >= 2000 &
total_properties >= 100 &
pct_homestead < 30 &
owner_occ_rate > 40) %>%
select(GEOID, pct_homestead, pop_density, total_properties) %>%
arrange(pct_homestead)
# For the map visualization
low_enrollment_tracts_map <- ggplot(census_tracts_enriched) +
geom_sf(aes(fill = case_when(
pop_density >= 2000 &
total_properties >= 100 &
pct_homestead < 30 &
owner_occ_rate > 40 ~ "#f4aa9e",
TRUE ~ "gray80"
))) +
scale_fill_identity() +
labs(
title = "Low Homestead Exemption Tracts",
subtitle = "Tracts with <30% Homestead Rate & >40% Owner Occupancy",
caption = "Gray areas do not meet filtering criteria"
) +
map_theme
low_enrollment_tracts_map
#ggsave("outputs/homestead-exemption-low_enrollment_tracts.png", low_enrollment_tracts_map, width = 10, height = 6)
ggplot(census_tracts_enriched %>%
filter(pop_density >= 2000),
aes(x = owner_occ_rate)) +
geom_histogram(
binwidth = 5,
fill = "#e42524",
alpha = 0.8,
color = "white"
) +
labs(
title = "Distribution of Owner Occupancy Rates Across Philadelphia Census Tracts",
subtitle = "Excluding Low Density Areas (<2,000 per square mile)",
x = "Owner Occupancy Rate (%)",
y = "Number of Census Tracts"
) +
theme_minimal() +
theme(
plot.title = element_text(size = 18, face = "bold"),
plot.subtitle = element_text(size = 14),
axis.title = element_text(size = 14),
axis.text = element_text(size = 12)
) +
scale_x_continuous(breaks = seq(0, 100, by = 10))
The first predictor variable is based on the key eligibility criterion of the Homestead Exemption program: whether the homeowner resides long-term in the property itself. The closest proxy is whether the mailing street address (‘mailing_street’) is the same as the property street address (‘location’), as the owner's mailing address is often where the owner lives long-term.
The second predictor variable is based on the rule that a homeowner is ineligible for the Homestead Exemption if already enrolled in particular existing tax programs, such as LOOP and the residential tax abatement. This is determined by whether an existing tax relief exempts a portion of the building from tax. If there is a non-zero value for ‘exempt_building’ and the property is not currently enrolled in the Homestead Exemption, it is potentially already enrolled in LOOP or in the residential tax abatement.
residential_properties <- residential_properties %>% mutate(same_address = ifelse(mailing_street == location, 1, 0))
ggplot(residential_properties, aes(x = same_address, fill = as.factor(exemption))) +
geom_bar(position = "dodge") +
scale_fill_manual(values = colors, labels = c("No Exemption", "With Exemption")) +
labs(title = "Mailing Address Matches Property Address", x = "Address Match", y = "Count", fill = "Homestead Exemption") +
theme_minimal(base_size = 14) +
theme(panel.grid.major = element_line(color = "grey90"),
panel.grid.minor = element_blank(),
legend.position = "bottom",
plot.title = element_text(vjust = 0.5))
residential_properties <- residential_properties %>% mutate(potential_otherprog = ifelse(exempt_building > 0 & exemption == 0, 1, 0))
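The address-match proxy relies on exact string equality, so differences in case or spacing would be counted as mismatches. The sketch below, in base R on invented toy data, shows one way to compare normalized addresses instead. `normalize_address` is a hypothetical helper, and treating a blank mailing street as unknown is an assumption that should be checked against how the source data records owner-occupied properties.

```r
# Hypothetical helper: upper-case, trim, and collapse repeated whitespace
normalize_address <- function(x) {
  x <- toupper(trimws(x))
  gsub("\\s+", " ", x)
}

# Toy records (invented values)
toy <- data.frame(
  location       = c("123 MAIN ST", "45 OAK AVE", "9 ELM ST"),
  mailing_street = c("123 main  st ", "PO BOX 12", ""),
  stringsAsFactors = FALSE
)

loc  <- normalize_address(toy$location)
mail <- normalize_address(toy$mailing_street)

# Exact comparison, as in the same_address flag above
toy$same_address_exact <- as.integer(toy$mailing_street == toy$location)

# Normalized comparison; a blank mailing street is treated as unknown (NA)
toy$same_address_norm <- ifelse(mail == "", NA_integer_, as.integer(mail == loc))
toy
```

Comparing the two flags on the real data would show how many mismatches are formatting differences rather than genuinely different addresses.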
Depth of the property as well as the total property area were also predictors, although not as useful as the eligibility criteria.
residential_properties <- residential_properties %>% mutate(is_deep = ifelse(depth > 150, 1, 0))
ggplot(residential_properties %>%
filter(!is.na(is_deep)),
aes(x = factor(is_deep), fill = factor(exemption))) +
geom_bar(position = "dodge") +
scale_fill_manual(values = colors, labels = c("No Exemption", "With Exemption")) +
labs(title = "Property Depth by Homestead Exemption",
x = "Depth > 150",
y = "Count",
fill = "Exemption") +
theme_minimal(base_size = 14) +
theme(panel.grid.major = element_line(color = "grey90"),
panel.grid.minor = element_blank(),
legend.position = "bottom",
plot.title = element_text(vjust = 0.5))
residential_properties %>%
count(is_deep, exemption) %>%
tidyr::spread(key = exemption, value = n, fill = 0)
## is_deep 0 1
## 1 0 307320 234743
## 2 1 9985 8682
## 3 NA 3358 106
residential_properties %>%
count(same_address, exemption) %>%
tidyr::spread(key = exemption, value = n, fill = 0)
## same_address 0 1
## 1 0 197822 23683
## 2 1 122841 219848
residential_properties <- residential_properties %>% mutate(large_area = ifelse(total_area > 150000, 1, 0))
residential_properties %>%
count(large_area, exemption) %>%
tidyr::spread(key = exemption, value = n, fill = 0)
## large_area 0 1
## 1 0 319463 243508
## 2 1 603 17
## 3 NA 597 6
The next two predictor variables are based on another key eligibility criterion of the Homestead Exemption program: a property must not be used exclusively for business or rental purposes (partial use is allowed). We split this criterion into two parts, one assessing the potential for exclusive business use and the other the potential for exclusive rental use. A property is considered to have rental potential if it holds an active rental license, and business potential if it holds an active business license (excluding rental licenses). Both variables are binary (has or does not have).
business_license<-st_read("Data/business_licenses.geojson")
## Reading layer `business_licenses' from data source
## `C:\Users\14735\WPSDrive\376583023\WPS云盘\0_MCP\25spring\Smart Cities Practicum\Philly-Homeowners\Data\business_licenses.geojson'
## using driver `GeoJSON'
## replacing null geometries with empty geometries
## Simple feature collection with 425305 features and 42 fields (with 22253 geometries empty)
## Geometry type: POINT
## Dimension: XY
## Bounding box: xmin: -75.27421 ymin: 39.88002 xmax: -74.95819 ymax: 40.1374
## Geodetic CRS: WGS 84
rental_license<-business_license%>%
filter(licensetype=="Rental",
#rentalcategory=="Residential Dwellings",
licensestatus %in% c("Active"))%>%
select(opa_account_num)%>%
st_drop_geometry()%>%
distinct() %>%
filter(!is.na(opa_account_num)) %>%
mutate(rental_license = 1)
properties_rental <- properties_sf %>%
mutate(parcel_number = as.character(parcel_number))%>%
left_join(rental_license, by = c("parcel_number" = "opa_account_num"))%>%
mutate(rental_license = replace_na(rental_license, 0))
ggplot(properties_rental, aes(x = factor(rental_license), fill = factor(exemption))) +
geom_bar(position = "dodge") +
geom_text(stat = "count", aes(label = after_stat(count), color = factor(exemption)),
position = position_dodge(width = 0.9),
vjust = -0.5,
size = 3) +
scale_fill_manual(values = c("0" = "#e42524", "1" = "#00ADA9"),
labels = c("0" = "No Exemption", "1" = "With Exemption")) +
scale_color_manual(values = c("0" = "#e42524", "1" = "#00ADA9")) +
labs(title = "Rental Licenses Metrics by Homestead Exemption Status",
x = "Rental Licenses",
y = "Count",
fill = "Exemption Status") +
theme_minimal() +
theme(legend.position = "none")
#join to dataset
residential_properties <- residential_properties %>%
left_join(properties_rental%>%
st_drop_geometry()%>%
select(objectid,rental_license),
by = c("objectid" = "objectid"))
commercial_license<-business_license%>%
filter(licensestatus %in% c("Active"))%>%
filter(licensetype %in% c(
"Food Caterer",
"Food Establishment, Retail Perm Location (Large)",
"Food Establishment, Retail Permanent Location",
"Food Manufacturer / Wholesaler",
"Food Preparing and Serving",
"Food Preparing and Serving (30+ SEATS)",
"Motor Vehicle Repair / Retail Mobile Dispensing",
"Pawn Shop",
"Precious Metal Dealer",
"Public Garage / Parking Lot",
"Residential Property Wholesaler",
"Tire Dealer",
"Tow Company",
"Vacant Commercial Property"
))%>%
select(opa_account_num)%>%
st_drop_geometry()%>%
distinct() %>%
filter(!is.na(opa_account_num)) %>%
mutate(commercial_license = 1)
properties_commercial <- properties_sf %>%
mutate(parcel_number = as.character(parcel_number))%>%
left_join(commercial_license, by = c("parcel_number" = "opa_account_num"))%>%
mutate(commercial_license = replace_na(commercial_license, 0))
ggplot(properties_commercial, aes(x = factor(commercial_license), fill = factor(exemption))) +
geom_bar(position = "dodge") +
geom_text(stat = "count", aes(label = after_stat(count), color = factor(exemption)),
position = position_dodge(width = 0.9),
vjust = -0.5,
size = 3) +
scale_fill_manual(values = c("0" = "#e42524", "1" = "#00ADA9"),
labels = c("0" = "No Exemption", "1" = "With Exemption")) +
scale_color_manual(values = c("0" = "#e42524", "1" = "#00ADA9")) +
labs(title = "Commercial Licenses Metrics by Homestead Exemption Status",
x = "Commercial Licenses (Exclude Rental)",
y = "Count",
fill = "Exemption Status") +
theme_minimal() +
theme(legend.position = "none")
#join to dataset
residential_properties <- residential_properties %>%
left_join(properties_commercial%>%
st_drop_geometry()%>%
select(objectid,commercial_license),
by = c("objectid" = "objectid"))
Tax balance is the total real estate tax owed in a census tract, including principal, penalties, and interest from previous years. Although it is not part of the eligibility criteria for the Homestead Exemption, an outstanding balance may affect homeowners’ trust and willingness to apply, and may also influence the likelihood of approval. In addition, the total tax balance in a census tract can serve as a proxy for broader socioeconomic characteristics, such as income levels, educational attainment, and English proficiency, all of which may affect outreach for the Homestead Exemption program.
In our model, we include two tax balance-related predictor variables for each property: (1) the total tax balance of the census tract in which the property is located, and (2) the share of properties in that census tract that owe a tax balance.
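The two tract-level variables can be illustrated on toy data. The sketch below is only an illustration under stated assumptions: all numbers are made up, `balances_toy` and `props_toy` are hypothetical stand-ins for the real inputs, and the tract key is assumed to be an 11-digit GEOID whose characters 6 to 9 hold the first four digits of the tract code. Because several sub-tracts can share those four digits, balances are summed before the rate is computed.

```r
library(dplyr)

# Toy balances (made-up numbers): the first two rows share the 4-digit code "0001"
balances_toy <- tibble(
  census_tract = c("42101000101", "42101000102", "42101000200"),
  balance      = c(1000, 3000, 500),
  num_props    = c(10, 30, 5)
)

# Hypothetical count of all properties per 4-digit tract code
props_toy <- tibble(census = c(1, 2), total_props = c(200, 50))

tract_summary <- balances_toy %>%
  # Characters 6-9 of the GEOID give the 4-digit tract code
  mutate(census = as.numeric(substr(census_tract, 6, 9))) %>%
  group_by(census) %>%
  summarise(
    balance_total = sum(balance, na.rm = TRUE),   # variable (1): total tax balance
    tax_props     = sum(num_props, na.rm = TRUE), # properties owing a balance
    .groups = "drop"
  ) %>%
  left_join(props_toy, by = "census") %>%
  # variable (2): share of properties in the tract that owe a balance
  mutate(balance_rate = tax_props / total_props)

print(tract_summary)
```

Every property in a tract then receives the same `balance_total` and `balance_rate`, so these variables describe the neighbourhood rather than the individual parcel.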
balances <- read.csv("Data/real_estate_tax_balances_census_tract.csv")
balance_sf <- census_tracts_enriched %>%
left_join(balances %>% select(census_tract,balance,num_props),
by = c("GEOID" = "census_tract"))
ggplot(balance_sf) +
geom_sf(aes(fill = balance), color = "white", size = 0.1) +
scale_fill_gradientn(colors = c("#00ADA9","#e3f9f7","#f4aa9e", "#e42524"),
limits = range(balance_sf$balance, na.rm = TRUE),
breaks = range(balance_sf$balance, na.rm = TRUE)
) +
labs(title = "Total Tax Balance by Census Tract",
fill = "Tax Balance ($)") +
theme_minimal() +
theme(
panel.grid = element_blank(),
plot.title = element_text(size = 14),
plot.subtitle = element_text(size = 6),
axis.title = element_text(size = 6),
axis.text = element_blank(),
legend.text = element_text(size = 6),
legend.position = "bottom",
legend.direction = "horizontal"
)
# Visualize the relationship
ggplot(balance_sf, aes(x = balance, y = pct_homestead)) +
geom_point(alpha = 0.6) +
geom_smooth(method = "loess") +
labs(
title = "Tax Balance vs. Homestead Participation",
x = "Total Tax Balance ($)",
y = "Homestead Participation Rate (%)"
) +
theme_minimal()
#distinct
balance_distinct <- balances %>%
mutate(census = as.numeric(substr(census_tract, 6, 9)))%>%
st_drop_geometry()%>%
group_by(census) %>%
summarise(balance_avg=sum(balance,na.rm=TRUE)/sum(num_props, na.rm = TRUE),
balance_total=sum(balance,na.rm=TRUE),
tax_props=sum(num_props,na.rm=TRUE)) %>%
ungroup()
properties_number<-properties_sf%>%
select(census_tract,exemption,parcel_number)%>%
mutate(prop=1)%>%
st_drop_geometry()%>%
group_by(census_tract)%>%
summarise(total_props=sum(prop))%>%
ungroup()
properties_sf_number<-properties_sf%>%
left_join(properties_number,by="census_tract")
properties_balance <- properties_sf_number %>%
select(objectid, census_tract,exemption,parcel_number,total_props)%>%
left_join(balance_distinct, by = c("census_tract" = "census"))%>%
mutate(balance_avg = replace(balance_avg, is.na(balance_avg), 0),
balance_total = replace(balance_total, is.na(balance_total), 0),
tax_props=replace(tax_props,is.na(tax_props),0))%>%
mutate(balance_rate=tax_props/total_props)
avg_values <- properties_balance %>%
st_drop_geometry()%>%
group_by(exemption) %>%
summarise(
'Total Tax Balance (In 100,000)' = mean(balance_total, na.rm = TRUE)/100000,
#Tax_Props=mean(tax_props,na.rm=TRUE),
'% of Properties with Tax Balance'=mean(balance_rate,na.rm=TRUE)*100
) %>%
pivot_longer(cols = -exemption, names_to = "variable", values_to = "mean_value")
ggplot(avg_values, aes(x = variable, y = mean_value, fill = as.factor(exemption))) +
geom_bar(stat = "identity", width = 0.5, position = position_dodge(width = 0.6)) +
geom_text(aes(label = round(mean_value, 4), color = as.factor(exemption)),
position = position_dodge(width = 0.6),
vjust = -0.5, size = 5) +
labs(title = "Mean of Tax Variables by Exemption Status",
x = "Metrics",
y = "Mean Value",
fill = "Exemption Status") +
theme_minimal() +
theme(
plot.title = element_text(size = 18),
plot.subtitle = element_text(size = 14),
axis.title = element_text(size = 14),
axis.text = element_text(size = 12),
legend.text = element_text(size = 12),
legend.position = "bottom",
legend.direction = "horizontal"
) +
scale_fill_manual(values = c("#e42524", "#00ADA9"), labels = c("No Exemption", "With Exemption")) +
scale_color_manual(values = c("#e42524", "#00ADA9"), guide = "none")
#join to dataset
residential_properties <- residential_properties %>%
left_join(properties_balance%>%
st_drop_geometry()%>%
select(objectid,balance_total,balance_rate),
by = c("objectid" = "objectid"))
This section contains these variables:
- Avg. Market Value and Avg. Taxable Value: Continuous (Numerical)
- Sd. Market Value and Sd. Taxable Value: Continuous (Numerical)
Properties with lower values are more likely to belong to homeowners who qualify for tax relief. The Homestead Exemption reduces taxable assessments, which may help stabilize taxable values.
Our key findings include:
- Exempted properties have lower market values and taxable values than non-exempt ones.
- The standard deviation of market value is lower for exempted properties, suggesting more stable valuations.
# Load and inspect the assessments dataset
assessments2 <- read.csv("Data/assessments.csv")
colnames(assessments2)
## [1] "parcel_number" "year" "market_value" "taxable_land"
## [5] "taxable_building" "exempt_land" "exempt_building" "objectid"
head(assessments2, 10)
## parcel_number year market_value taxable_land taxable_building exempt_land
## 1 11001000 2020 237000 62094 129906 0
## 2 11001000 2019 218700 57299 121401 0
## 3 11001000 2018 192200 50356 111844 0
## 4 11001000 2017 192200 50356 111844 0
## 5 11001000 2016 192200 30150 132050 0
## 6 11001000 2015 192200 30150 132050 0
## 7 11001100 2025 381300 76260 305040 0
## 8 11001100 2024 339800 67960 271840 0
## 9 11001100 2023 339800 67960 271840 0
## 10 11001100 2022 282300 73963 208337 0
## exempt_building objectid
## 1 45000 2840898049
## 2 40000 2840898050
## 3 30000 2840898051
## 4 30000 2840898052
## 5 30000 2840898053
## 6 30000 2840898054
## 7 0 2840898055
## 8 0 2840898056
## 9 0 2840898057
## 10 0 2840898058
# Merge properties and assessments data by parcel_number
cleaned_properties <- filtered_properties %>%
select(parcel_number, exemption, is_residential, shape)
assessment_combined <- assessments2 %>%
left_join(cleaned_properties, by = "parcel_number")
head(assessment_combined)
## parcel_number year market_value taxable_land taxable_building exempt_land
## 1 11001000 2020 237000 62094 129906 0
## 2 11001000 2019 218700 57299 121401 0
## 3 11001000 2018 192200 50356 111844 0
## 4 11001000 2017 192200 50356 111844 0
## 5 11001000 2016 192200 30150 132050 0
## 6 11001000 2015 192200 30150 132050 0
## exempt_building objectid exemption is_residential
## 1 45000 2840898049 1 1
## 2 40000 2840898050 1 1
## 3 30000 2840898051 1 1
## 4 30000 2840898052 1 1
## 5 30000 2840898053 1 1
## 6 30000 2840898054 1 1
## shape
## 1 SRID=2272;POINT ( 2698365.44997206 228564.73242714)
## 2 SRID=2272;POINT ( 2698365.44997206 228564.73242714)
## 3 SRID=2272;POINT ( 2698365.44997206 228564.73242714)
## 4 SRID=2272;POINT ( 2698365.44997206 228564.73242714)
## 5 SRID=2272;POINT ( 2698365.44997206 228564.73242714)
## 6 SRID=2272;POINT ( 2698365.44997206 228564.73242714)
# Filter Residential Data
# Filter the combined dataset to include only residential properties
residential_assessment_combined <- assessment_combined %>%
filter(is_residential == 1)
head(residential_assessment_combined)
## parcel_number year market_value taxable_land taxable_building exempt_land
## 1 11001000 2020 237000 62094 129906 0
## 2 11001000 2019 218700 57299 121401 0
## 3 11001000 2018 192200 50356 111844 0
## 4 11001000 2017 192200 50356 111844 0
## 5 11001000 2016 192200 30150 132050 0
## 6 11001000 2015 192200 30150 132050 0
## exempt_building objectid exemption is_residential
## 1 45000 2840898049 1 1
## 2 40000 2840898050 1 1
## 3 30000 2840898051 1 1
## 4 30000 2840898052 1 1
## 5 30000 2840898053 1 1
## 6 30000 2840898054 1 1
## shape
## 1 SRID=2272;POINT ( 2698365.44997206 228564.73242714)
## 2 SRID=2272;POINT ( 2698365.44997206 228564.73242714)
## 3 SRID=2272;POINT ( 2698365.44997206 228564.73242714)
## 4 SRID=2272;POINT ( 2698365.44997206 228564.73242714)
## 5 SRID=2272;POINT ( 2698365.44997206 228564.73242714)
## 6 SRID=2272;POINT ( 2698365.44997206 228564.73242714)
colnames(residential_assessment_combined)
## [1] "parcel_number" "year" "market_value" "taxable_land"
## [5] "taxable_building" "exempt_land" "exempt_building" "objectid"
## [9] "exemption" "is_residential" "shape"
# Market Value Growth Rate by Exemption Status (2016-2025)
# Calculate yearly market value and growth rate by exemption status
yearly_market_value <- residential_assessment_combined %>%
group_by(year, exemption) %>%
summarise(total_market_value = sum(market_value, na.rm = TRUE), .groups = 'drop')
yearly_market_value <- yearly_market_value %>%
arrange(exemption, year) %>%
group_by(exemption) %>%
mutate(market_value_growth_rate = (total_market_value - lag(total_market_value)) / lag(total_market_value) * 100)
# Filter data for the years 2016-2025
yearly_market_value_filtered <- yearly_market_value %>%
filter(year >= 2016 & year <= 2025)
# Plot the market value growth rate
ggplot(yearly_market_value_filtered, aes(x = year, y = market_value_growth_rate, color = as.factor(exemption))) +
geom_line(size = 1.2) +
geom_point(size = 2) +
scale_x_continuous(breaks = seq(2016, 2025, by = 1)) +
scale_y_continuous(labels = percent_format(scale = 1)) +
scale_color_manual(values = c("0" = "#E42524", "1" = "#00ADA9"), labels = c("No Exemption", "With Exemption")) +
labs(title = "Market Value Growth Rate (2016-2025)",
x = "Year",
y = "Growth Rate (%)",
color = "Exemption Status") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
print(yearly_market_value_filtered)
## # A tibble: 20 × 4
## # Groups: exemption [2]
## year exemption total_market_value market_value_growth_rate
## <int> <dbl> <dbl> <dbl>
## 1 2016 0 68303720915 1.60
## 2 2017 0 70567844512 3.31
## 3 2018 0 80287997319 13.8
## 4 2019 0 88179071992 9.83
## 5 2020 0 93001545386 5.47
## 6 2021 0 94757821078 1.89
## 7 2022 0 96509983650 1.85
## 8 2023 0 114634228622 18.8
## 9 2024 0 116744008982 1.84
## 10 2025 0 129886267715 11.3
## 11 2016 1 38156905833 0.708
## 12 2017 1 38443943033 0.752
## 13 2018 1 38667839100 0.582
## 14 2019 1 43364362090 12.1
## 15 2020 1 44984523550 3.74
## 16 2021 1 44981876450 -0.00588
## 17 2022 1 45009628972 0.0617
## 18 2023 1 56848412909 26.3
## 19 2024 1 56869572229 0.0372
## 20 2025 1 68185163435 19.9
# Calculate average growth rate by exemption status
growth_comparison <- yearly_market_value_filtered %>%
group_by(exemption) %>%
summarise(market_value_growth_rate = mean(market_value_growth_rate, na.rm = TRUE))
print(growth_comparison)
## # A tibble: 2 × 2
## exemption market_value_growth_rate
## <dbl> <dbl>
## 1 0 6.96
## 2 1 6.42
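For each parcel, we compute three groups of metrics: mean values of market_value, taxable_land, taxable_building, exempt_land, and exempt_building; the growth rate of market value (percentage change between the earliest and the latest assessment year); and standard deviations (to measure volatility).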
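As a minimal sketch of these per-parcel metrics, the toy example below uses made-up values and a hypothetical `toy_history` table. It assumes the rows may arrive in any year order (here newest-first), so it sorts by year before taking the first and last values.

```r
library(dplyr)

# Toy assessment history (made-up values), deliberately stored newest-first
toy_history <- tibble(
  parcel_number = c(1, 1, 1, 2, 2),
  year          = c(2022, 2021, 2020, 2021, 2020),
  market_value  = c(120000, 110000, 100000, 90000, 100000)
)

toy_summary <- toy_history %>%
  arrange(parcel_number, year) %>%   # oldest row first within each parcel
  group_by(parcel_number) %>%
  summarise(
    avg_market_value    = mean(market_value, na.rm = TRUE),
    # Percentage change from the earliest to the latest assessment
    growth_market_value = (last(market_value) - first(market_value)) /
      first(market_value) * 100,
    sd_market_value     = sd(market_value, na.rm = TRUE),
    .groups = "drop"
  )

print(toy_summary)
```

Without the `arrange()` step, `first()` and `last()` follow the stored row order, which would flip the sign of the growth rate for data kept newest-first.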
# Summarize key metrics for each parcel: mean, growth rate, and standard deviation
residential_summary <- residential_assessment_combined %>%
group_by(parcel_number) %>%
summarise(
avg_market_value = mean(market_value, na.rm = TRUE),
avg_taxable_land = mean(taxable_land, na.rm = TRUE),
avg_taxable_building = mean(taxable_building, na.rm = TRUE),
avg_exempt_land = mean(exempt_land, na.rm = TRUE),
avg_exempt_building = mean(exempt_building, na.rm = TRUE),
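# Order by year explicitly: the assessment rows are stored newest-first
growth_market_value = (last(market_value, order_by = year) - first(market_value, order_by = year)) / first(market_value, order_by = year) * 100,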
sd_market_value = sd(market_value, na.rm = TRUE),
sd_taxable_land = sd(taxable_land, na.rm = TRUE),
sd_taxable_building = sd(taxable_building, na.rm = TRUE)
)
# Add exemption and residential status to the summary dataset
residential_summary <- residential_summary %>%
left_join(
residential_assessment_combined %>%
select(parcel_number, exemption, is_residential, shape) %>%
distinct(parcel_number, .keep_all = TRUE), # Keep one row per parcel_number
by = "parcel_number"
)
head(residential_summary)
## # A tibble: 6 × 13
## parcel_number avg_market_value avg_taxable_land avg_taxable_building
## <dbl> <dbl> <dbl> <dbl>
## 1 11000001 112700 112700 0
## 2 11000002 106600 106600 0
## 3 11000003 106600 106600 0
## 4 11000004 90333. 90333. 0
## 5 11000005 106600 106600 0
## 6 11000006 144533. 144533. 0
## # ℹ 9 more variables: avg_exempt_land <dbl>, avg_exempt_building <dbl>,
## # growth_market_value <dbl>, sd_market_value <dbl>, sd_taxable_land <dbl>,
## # sd_taxable_building <dbl>, exemption <dbl>, is_residential <dbl>,
## # shape <chr>
# Analysis by Exemption Status
# Summarize metrics by exemption status
residential_summary_analysis <- residential_summary %>%
group_by(exemption) %>%
summarise(
avg_market_value = mean(avg_market_value, na.rm = TRUE),
avg_taxable_land = mean(avg_taxable_land, na.rm = TRUE),
avg_taxable_building = mean(avg_taxable_building, na.rm = TRUE),
avg_exempt_land = mean(avg_exempt_land, na.rm = TRUE),
avg_exempt_building = mean(avg_exempt_building, na.rm = TRUE),
sd_market_value = mean(sd_market_value, na.rm = TRUE),
sd_taxable_land = mean(sd_taxable_land, na.rm = TRUE),
sd_taxable_building = mean(sd_taxable_building, na.rm = TRUE)
)
# Convert the summary data to long format for visualization
library(tidyr)
summary_residential_long <- residential_summary_analysis %>%
pivot_longer(
cols = -exemption,
names_to = "metric",
values_to = "value"
)
# Plot residential property metrics by exemption status
library(ggplot2)
custom_colors <- c("0" = "#E42524",
"1" = "#00ADA9")
ggplot(summary_residential_long, aes(x = metric, y = value, fill = as.factor(exemption))) +
geom_bar(stat = "identity", position = "dodge") +
scale_fill_manual(values = custom_colors, labels = c("No Exemption", "With Exemption")) +
labs(title = "Residential Property Metrics by Homestead Exemption Status (2015-2025)",
x = "Metric",
y = "Value",
fill = "Exemption Status") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
#Plot Only "avg_market_value" and "sd_market_value"
summary_filtered <- summary_residential_long %>%
filter(metric %in% c("avg_market_value", "sd_market_value"))
custom_colors <- c("0" = "#E42524",
"1" = "#00ADA9")
ggplot(summary_filtered, aes(x = metric, y = value, fill = as.factor(exemption), color = as.factor(exemption))) +
geom_bar(stat = "identity", position = position_dodge(width = 0.7), width = 0.6, alpha = 0.85) +
geom_text(aes(label = round(value, 0)),
position = position_dodge(width = 0.7),
vjust = -0.5, size = 4.5, fontface = "bold") +
ylim(0, max(summary_filtered$value) * 1.2) +
scale_fill_manual(values = custom_colors, labels = c("No Exemption", "With Exemption")) +
scale_color_manual(values = custom_colors, guide = "none") +
labs(title = "Average and Standard Deviation of Market Value (2015-2025)",
subtitle = "Comparison of Properties With and Without Exemption",
x = "Metric",
y = "Value ($)",
fill = "Exemption Status") +
theme_minimal(base_size = 14) +
theme(axis.text.x = element_text(angle = 0, vjust = 0.5, hjust = 0.5, face = "bold"),
axis.title = element_text(face = "bold"),
legend.position = "top",
plot.title = element_text(face = "bold", size = 16),
plot.subtitle = element_text(size = 13, color = "gray40"))
Predictive Data to Use:
- Taxable Land & Taxable Building Value: Continuous (Numerical)
- Yearly Growth Rate (Taxable Land & Taxable Building): Continuous (Numerical)
- Whether there is a sudden change (growth rate above average): Binary (0 = No sudden change, 1 = Sudden change)
Key Findings:
- Exempt land values remain consistently low, while non-exempt land values increase more significantly over time.
- Taxable building values show a large gap between exempt and non-exempt properties, with non-exempt properties rising much more.
Possible Reason: Exempted properties may be less exposed to market-driven tax hikes, while non-exempt ones see faster appreciation and tax increases.
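The yearly growth rate and the sudden-change flag listed above could be derived along the following lines. This is a sketch on made-up data, not the project's implementation: "above average" is assumed here to mean above that year's mean growth across all parcels, and `toy_assessments`, `yearly_growth`, and `parcel_flags` are hypothetical names.

```r
library(dplyr)

# Toy data (made-up values): three parcels, three assessment years each
toy_assessments <- tibble(
  parcel_number    = rep(c(1, 2, 3), each = 3),
  year             = rep(2020:2022, times = 3),
  taxable_building = c(100, 102, 104,
                       100, 101, 102,
                       100, 110, 220)
)

yearly_growth <- toy_assessments %>%
  arrange(parcel_number, year) %>%
  group_by(parcel_number) %>%
  mutate(
    previous_value = lag(taxable_building),
    # Year-over-year growth in percent; undefined when the previous value is 0 or missing
    growth_rate = if_else(previous_value > 0,
                          (taxable_building - previous_value) / previous_value * 100,
                          NA_real_)
  ) %>%
  group_by(year) %>%
  mutate(
    avg_growth    = mean(growth_rate, na.rm = TRUE),
    # Assumption: a "sudden change" is growth above that year's average
    sudden_change = as.integer(!is.na(growth_rate) & growth_rate > avg_growth)
  ) %>%
  ungroup()

# Collapse to one binary flag per parcel: 1 if any year shows a sudden change
parcel_flags <- yearly_growth %>%
  group_by(parcel_number) %>%
  summarise(sudden_change = max(sudden_change), .groups = "drop")

print(parcel_flags)
```

The guard on `previous_value > 0` matters for the real data, where some parcels have a taxable building value of 0 and a growth rate would otherwise be infinite.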
yearly_residential_data <- assessments2 %>%
inner_join(cleaned_properties, by = "parcel_number") %>%
filter(is_residential == 1) # Keep only residential properties
# Summarize yearly residential data by exemption status
yearly_residential_summary <- yearly_residential_data %>%
group_by(year, exemption) %>%
summarise(
market_value = mean(market_value, na.rm = TRUE),
taxable_land = mean(taxable_land, na.rm = TRUE),
taxable_building = mean(taxable_building, na.rm = TRUE),
exempt_land = mean(exempt_land, na.rm = TRUE),
exempt_building = mean(exempt_building, na.rm = TRUE),
.groups = "drop"
)
# Plot yearly changes in residential metrics
library(ggplot2)
library(tidyr)
# Convert the data to long format for better visualization
yearly_residential_long <- yearly_residential_summary %>%
pivot_longer(
cols = c(market_value, taxable_land, taxable_building, exempt_land, exempt_building),
names_to = "metric",
values_to = "value"
)
# Filter data for years 2015 and later
yearly_residential_long_filtered <- yearly_residential_long %>%
filter(year >= 2015)
custom_colors <- c("0" = "#E42524", # Red (No Exemption)
"1" = "#00ADA9") # Green (With Exemption)
# Generate individual plots for each metric
plots <- yearly_residential_long_filtered %>%
split(.$metric) %>%
lapply(function(df) {
ggplot(df, aes(x = year, y = value, color = as.factor(exemption))) +
geom_line(size = 1.2) +
geom_point(size = 2) +
scale_x_continuous(breaks = seq(2015, max(df$year), by = 1)) +
scale_color_manual(values = custom_colors, labels = c("No Exemption", "With Exemption")) +
labs(title = paste("Yearly Change in", unique(df$metric)),
x = "Year",
y = "Value",
color = "Exemption Status") +
theme_minimal() +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
})
for (p in plots) {
print(p)
}
Key Findings:
- Certain deed types (e.g., Deed - Deceased, Satisfaction of Mortgage, Mortgage) have a higher exemption rate, while others (e.g., Deed Land Bank, Declaration of Condominium) have a markedly lower one.
Possible Reason:
- These transactions indicate long-term ownership, inheritance, or financial restructuring, which are consistent with owner occupancy and therefore with the exemption.
Predictive Data to Use:
- Property deed type: Categorical (Nominal)
transfers2 <- read.csv("Data/RTT_SUMMARY.csv")
residential_transfers <- residential_properties %>%
left_join(transfers2, by = c("parcel_number" = "opa_account_num"))
# Deed Type Analysis
residential_transfers <- residential_transfers %>%
mutate(document_type = ifelse(is.na(document_type), "Unknown", document_type))
exemption_by_document_type <- residential_transfers %>%
group_by(document_type) %>%
summarise(
total_count = n(),
exemption_count = sum(exemption == 1),
exemption_proportion = exemption_count / total_count * 100
) %>%
arrange(desc(exemption_proportion))
print(exemption_by_document_type)
## # A tibble: 30 × 4
## document_type total_count exemption_count exemption_proportion
## <chr> <int> <int> <dbl>
## 1 "NOTARY COMMISSION" 1 1 100
## 2 "DEED - DECEASED " 217 145 66.8
## 3 "SATISFACTION OF MORTGAGE" 199081 99458 50.0
## 4 "MORTGAGE" 217667 100540 46.2
## 5 "Unknown" 302682 129494 42.8
## 6 "ALL OTHER MISCELLANEOUS IN… 75 32 42.7
## 7 "ASSIGNMENT OF MORTGAGE" 57498 23583 41.0
## 8 "POWER OF ATTORNEY" 2859 1042 36.4
## 9 "DECLARATION OF PLANNED COM… 230 78 33.9
## 10 "DEED - ADVERSE POSSESSION" 3 1 33.3
## # ℹ 20 more rows
## binomial model:DEED - DECEASED, SATISFACTION OF MORTGAGE, MORTGAGE, ASSIGNMENT OF MORTGAGE
document_exemption_model <- glm(exemption ~ document_type, data = residential_transfers, family = binomial)
summary(document_exemption_model)
##
## Call:
## glm(formula = exemption ~ document_type, family = binomial, data = residential_transfers)
##
## Coefficients:
## Estimate Std. Error
## (Intercept) -1.48638 0.25404
## document_typeALL OTHER MISCELLANEOUS INSTRUMENTS 1.19091 0.34502
## document_typeAMENDMENT -0.77240 0.28647
## document_typeAMENDMENT TO DECLARATION OF CONDOMINIUM -0.19468 0.26273
## document_typeAMENDMENT TO DECLARATION OF PLANNED COMMUNITY -4.39121 1.02484
## document_typeASSIGNMENT -2.63044 0.33276
## document_typeASSIGNMENT OF MORTGAGE 1.12305 0.25418
## document_typeCERTIFICATE OF STOCK TRANSFER -2.85526 0.48309
## document_typeCONTINUATION -0.82003 0.26309
## document_typeDECLARATION OF CONDOMINIUM -3.82189 0.51497
## document_typeDECLARATION OF PLANNED COMMUNITY 0.81921 0.28972
## document_typeDEED 0.76405 0.25409
## document_typeDEED - ADVERSE POSSESSION 0.79323 1.25081
## document_typeDEED - DECEASED 2.18645 0.29210
## document_typeDEED LAND BANK -3.49996 0.51560
## document_typeDEED OF CONDEMNATION -1.15268 0.44549
## document_typeDEED RTT - OTHER -0.10552 0.34185
## document_typeDM - LIS PENDENS -0.09554 0.27046
## document_typeMISCELLANEOUS DEED -0.21891 0.25429
## document_typeMISCELLANEOUS DEED TAXABLE -3.29275 0.75412
## document_typeMORTGAGE 1.33367 0.25408
## document_typeNOTARY COMMISSION 10.05214 43.95469
## document_typeORIGINAL FINANCING STATEMENT -0.04482 0.25502
## document_typePOWER OF ATTORNEY 0.93033 0.25699
## document_typeRELEASE -0.86500 0.78240
## document_typeRELEASE OF MORTGAGE -0.26764 0.25673
## document_typeSATISFACTION OF MORTGAGE 1.48472 0.25408
## document_typeSHERIFF'S DEED -0.87434 0.26192
## document_typeTERMINATION -0.21761 0.25728
## document_typeUnknown 1.19563 0.25407
## z value
## (Intercept) -5.851
## document_typeALL OTHER MISCELLANEOUS INSTRUMENTS 3.452
## document_typeAMENDMENT -2.696
## document_typeAMENDMENT TO DECLARATION OF CONDOMINIUM -0.741
## document_typeAMENDMENT TO DECLARATION OF PLANNED COMMUNITY -4.285
## document_typeASSIGNMENT -7.905
## document_typeASSIGNMENT OF MORTGAGE 4.418
## document_typeCERTIFICATE OF STOCK TRANSFER -5.910
## document_typeCONTINUATION -3.117
## document_typeDECLARATION OF CONDOMINIUM -7.422
## document_typeDECLARATION OF PLANNED COMMUNITY 2.828
## document_typeDEED 3.007
## document_typeDEED - ADVERSE POSSESSION 0.634
## document_typeDEED - DECEASED 7.485
## document_typeDEED LAND BANK -6.788
## document_typeDEED OF CONDEMNATION -2.587
## document_typeDEED RTT - OTHER -0.309
## document_typeDM - LIS PENDENS -0.353
## document_typeMISCELLANEOUS DEED -0.861
## document_typeMISCELLANEOUS DEED TAXABLE -4.366
## document_typeMORTGAGE 5.249
## document_typeNOTARY COMMISSION 0.229
## document_typeORIGINAL FINANCING STATEMENT -0.176
## document_typePOWER OF ATTORNEY 3.620
## document_typeRELEASE -1.106
## document_typeRELEASE OF MORTGAGE -1.043
## document_typeSATISFACTION OF MORTGAGE 5.844
## document_typeSHERIFF'S DEED -3.338
## document_typeTERMINATION -0.846
## document_typeUnknown 4.706
## Pr(>|z|)
## (Intercept) 0.00000000488747010
## document_typeALL OTHER MISCELLANEOUS INSTRUMENTS 0.000557
## document_typeAMENDMENT 0.007013
## document_typeAMENDMENT TO DECLARATION OF CONDOMINIUM 0.458718
## document_typeAMENDMENT TO DECLARATION OF PLANNED COMMUNITY 0.00001829077280808
## document_typeASSIGNMENT 0.00000000000000268
## document_typeASSIGNMENT OF MORTGAGE 0.00000994866804138
## document_typeCERTIFICATE OF STOCK TRANSFER 0.00000000341148639
## document_typeCONTINUATION 0.001827
## document_typeDECLARATION OF CONDOMINIUM 0.00000000000011571
## document_typeDECLARATION OF PLANNED COMMUNITY 0.004690
## document_typeDEED 0.002639
## document_typeDEED - ADVERSE POSSESSION 0.525969
## document_typeDEED - DECEASED 0.00000000000007139
## document_typeDEED LAND BANK 0.00000000001136266
## document_typeDEED OF CONDEMNATION 0.009670
## document_typeDEED RTT - OTHER 0.757582
## document_typeDM - LIS PENDENS 0.723894
## document_typeMISCELLANEOUS DEED 0.389303
## document_typeMISCELLANEOUS DEED TAXABLE 0.00001263599788511
## document_typeMORTGAGE 0.00000015283707518
## document_typeNOTARY COMMISSION 0.819107
## document_typeORIGINAL FINANCING STATEMENT 0.860491
## document_typePOWER OF ATTORNEY 0.000295
## document_typeRELEASE 0.268915
## document_typeRELEASE OF MORTGAGE 0.297172
## document_typeSATISFACTION OF MORTGAGE 0.00000000511078389
## document_typeSHERIFF'S DEED 0.000843
## document_typeTERMINATION 0.397663
## document_typeUnknown 0.00000252637981775
##
## (Intercept) ***
## document_typeALL OTHER MISCELLANEOUS INSTRUMENTS ***
## document_typeAMENDMENT **
## document_typeAMENDMENT TO DECLARATION OF CONDOMINIUM
## document_typeAMENDMENT TO DECLARATION OF PLANNED COMMUNITY ***
## document_typeASSIGNMENT ***
## document_typeASSIGNMENT OF MORTGAGE ***
## document_typeCERTIFICATE OF STOCK TRANSFER ***
## document_typeCONTINUATION **
## document_typeDECLARATION OF CONDOMINIUM ***
## document_typeDECLARATION OF PLANNED COMMUNITY **
## document_typeDEED **
## document_typeDEED - ADVERSE POSSESSION
## document_typeDEED - DECEASED ***
## document_typeDEED LAND BANK ***
## document_typeDEED OF CONDEMNATION **
## document_typeDEED RTT - OTHER
## document_typeDM - LIS PENDENS
## document_typeMISCELLANEOUS DEED
## document_typeMISCELLANEOUS DEED TAXABLE ***
## document_typeMORTGAGE ***
## document_typeNOTARY COMMISSION
## document_typeORIGINAL FINANCING STATEMENT
## document_typePOWER OF ATTORNEY ***
## document_typeRELEASE
## document_typeRELEASE OF MORTGAGE
## document_typeSATISFACTION OF MORTGAGE ***
## document_typeSHERIFF'S DEED ***
## document_typeTERMINATION
## document_typeUnknown ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 1408886 on 1043658 degrees of freedom
## Residual deviance: 1360982 on 1043629 degrees of freedom
## AIC: 1361042
##
## Number of Fisher Scoring iterations: 7
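The coefficients above are on the log-odds scale, which makes them hard to read directly. As a short sketch, the snippet below copies a few estimates from the summary and converts them into predicted exemption rates and odds ratios. The selection of deed types is ours, and the odds ratios are relative to the model's reference deed type, which the summary does not name.

```r
# Estimates copied from the model summary (log-odds scale)
intercept <- -1.48638
coefs <- c(
  "DEED - DECEASED"          = 2.18645,
  "SATISFACTION OF MORTGAGE" = 1.48472,
  "MORTGAGE"                 = 1.33367,
  "DEED LAND BANK"           = -3.49996
)

# plogis() is the inverse logit, 1 / (1 + exp(-x)); multiply by 100 for percent
predicted_rate <- plogis(intercept + coefs) * 100

# Odds ratio of each deed type relative to the reference category
odds_ratio <- exp(coefs)

print(round(predicted_rate, 1))
print(round(odds_ratio, 2))
```

Because the model has a single categorical predictor, the predicted rates for the first three deed types should reproduce the shares in the exemption_by_document_type table (66.8%, 50.0%, and 46.2%), which is a quick consistency check on the fit. The negative Deed Land Bank coefficient translates into a predicted rate below 1%.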
Key Findings:
- A substantial share (40.8%) of minimal financial consideration transfers (≤ $10) involved properties with the exemption.
Possible Reason:
- These transfers often occur in family transactions, estate planning, or financial hardship cases, which may align with the exemption criteria.
Predictive Data to Use:
- Total transaction value: Continuous (Numerical)
- Whether the property has a minimal financial consideration transfer: Binary (0 = No, 1 = Yes)
# Minimal Financial Consideration Analysis
minimal_threshold <- 10
minimal_financial_transfers <- residential_transfers %>%
filter(total_consideration <= minimal_threshold)
exemption_summary <- minimal_financial_transfers %>%
group_by(exemption) %>%
summarise(
count = n(),
proportion = count / nrow(minimal_financial_transfers) * 100
)
print(exemption_summary)
## # A tibble: 2 × 3
## exemption count proportion
## <dbl> <int> <dbl>
## 1 0 279647 59.2
## 2 1 192950 40.8
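The summary above counts transfer records, whereas the model needs one row per property. A minimal sketch of how the two transfer predictors could be collapsed to parcel level is shown below. The data are made up, `has_minimal_transfer` and `total_transaction_value` are hypothetical column names, and summing all considerations per parcel is an assumption about what "total transaction value" means.

```r
library(dplyr)

# Toy transfer records (made-up values); parcel 3 has no recorded consideration
toy_transfers <- tibble(
  parcel_number       = c(1, 1, 2, 3),
  total_consideration = c(1, 250000, 180000, NA)
)

minimal_threshold <- 10

parcel_transfer_features <- toy_transfers %>%
  group_by(parcel_number) %>%
  summarise(
    # Assumption: total transaction value is the sum over all recorded transfers
    total_transaction_value = sum(total_consideration, na.rm = TRUE),
    # 1 if the parcel has at least one transfer at or below the threshold
    has_minimal_transfer = as.integer(
      any(total_consideration <= minimal_threshold, na.rm = TRUE)
    ),
    .groups = "drop"
  )

print(parcel_transfer_features)
```

Parcels with only missing considerations get a flag of 0 and a total of 0 here; whether they should instead be treated as unknown is a modelling choice.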
#Recent(2y) Transfer Analysis
library(dplyr)
library(lubridate)
# Convert recording_date to Date format
residential_transfers <- residential_transfers %>%
mutate(recording_date = as.Date(recording_date.x, format="%Y-%m-%d"))
# Get the current year
current_year <- year(Sys.Date())
# Filter and summarize recent transfers
residential_transfers_2y <- residential_transfers %>%
filter(!is.na(recording_date)) %>%
group_by(parcel_number) %>%
summarise(
latest_transfer_year = max(year(recording_date), na.rm = TRUE), # Most recent transfer year
exemption = first(exemption), # Retain exemption status
.groups = "drop"
) %>%
mutate(
latest_transfer_year = ifelse(is.infinite(latest_transfer_year), NA, latest_transfer_year) # Handle infinite values
)
# Mark properties with recent transfers (within 2 years)
residential_transfers_2y <- residential_transfers_2y %>%
mutate(has_recent_transfer = ifelse(!is.na(latest_transfer_year) & latest_transfer_year >= (current_year - 2), 1, 0))
# View results
print(head(residential_transfers_2y))
## # A tibble: 6 × 4
## parcel_number latest_transfer_year exemption has_recent_transfer
## <dbl> <int> <dbl> <dbl>
## 1 11000001 2021 0 0
## 2 11000002 2021 0 0
## 3 11000003 2021 0 0
## 4 11000004 2021 0 0
## 5 11000006 2021 0 0
## 6 11000010 2021 0 0
# Calculate recent transfer rates by exemption status
exemption_transfer_analysis_residential <- residential_transfers_2y %>%
group_by(exemption) %>%
summarise(
avg_recent_transfer = mean(has_recent_transfer, na.rm = TRUE) * 100 # Convert to percentage
)
# View results
print(exemption_transfer_analysis_residential)
## # A tibble: 2 × 2
## exemption avg_recent_transfer
## <dbl> <dbl>
## 1 0 9.52
## 2 1 4.24
On average, 9.52% of properties without a homestead exemption had a transfer in the past two years, compared to 4.24% of those with an exemption. Properties without the exemption were therefore more than twice as likely to have changed hands recently.
# Create a contingency table for exemption and recent transfers
exemption_transfer_table <- table(residential_transfers_2y$exemption, residential_transfers_2y$has_recent_transfer)
# Perform chi-square test
chi_test_result <- chisq.test(exemption_transfer_table)
# View test results
print(chi_test_result)
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: exemption_transfer_table
## X-squared = 5721.6, df = 1, p-value < 0.00000000000000022
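With a sample this large, the chi-squared test above will report a very small p-value for almost any difference, so it helps to report the size of the gap as well. The sketch below shows one way to do that on a 2 x 2 table. The counts in `toy_table` are hypothetical and are not the report's data; the same calls could be applied to `exemption_transfer_table`.

```r
# Hypothetical counts for illustration only; not the report's data
toy_table <- matrix(
  c(900, 100,   # exemption = 0: no recent transfer, recent transfer
    950,  50),  # exemption = 1: no recent transfer, recent transfer
  nrow = 2, byrow = TRUE,
  dimnames = list(exemption = c("0", "1"), has_recent_transfer = c("0", "1"))
)

# Share of each exemption group with a recent transfer
transfer_rate <- prop.table(toy_table, margin = 1)[, "1"]

rate_difference <- transfer_rate[["0"]] - transfer_rate[["1"]]  # gap in proportions
rate_ratio      <- transfer_rate[["0"]] / transfer_rate[["1"]]  # relative likelihood

# Cramer's V for a 2 x 2 table: sqrt(chi-squared / n), no continuity correction
chi_sq    <- chisq.test(toy_table, correct = FALSE)$statistic
cramers_v <- sqrt(unname(chi_sq) / sum(toy_table))

print(c(difference = rate_difference, ratio = rate_ratio, cramers_v = cramers_v))
```

The rate difference and ratio are the quantities most directly useful for outreach planning, while Cramer's V gives a scale-free measure of how strongly the two flags are associated.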
# Scatterplot of homestead rates vs median home values
ggplot(census_tracts_enriched %>%
filter(pop_density > 0),
aes(x = median_home_value, y = pct_homestead)) +
geom_point(alpha = 0.6) +
geom_smooth(method = "loess", se = TRUE) +
scale_x_continuous(labels = scales::dollar_format()) +
theme_minimal() +
labs(
title = "Homestead Exemption Rates vs. Home Values",
x = "Median Home Value",
y = "Percentage with Homestead Exemption"
)
# Use a linear model (lm)
# Create the linear model
homestead_model <- lm(pct_homestead ~ median_home_value,
data = census_tracts_enriched %>%
filter(pop_density > 0))
# View the summary statistics
summary(homestead_model)
##
## Call:
## lm(formula = pct_homestead ~ median_home_value, data = census_tracts_enriched %>%
## filter(pop_density > 0))
##
## Residuals:
## Min 1Q Median 3Q Max
## -41.514 -13.480 -0.905 11.873 42.103
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 42.793398855 1.606096340 26.644 <0.0000000000000002 ***
## median_home_value 0.000003141 0.000005130 0.612 0.541
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 17.05 on 374 degrees of freedom
## (15 observations deleted due to missingness)
## Multiple R-squared: 0.001001, Adjusted R-squared: -0.00167
## F-statistic: 0.3749 on 1 and 374 DF, p-value: 0.5407
# Visualize with linear fit instead of loess
ggplot(census_tracts_enriched %>%
filter(pop_density > 0),
aes(x = median_home_value, y = pct_homestead)) +
geom_point(alpha = 0.6) +
geom_smooth(method = "lm", se = TRUE) + # Changed from loess to lm
scale_x_continuous(labels = scales::dollar_format()) +
theme_minimal() +
labs(
title = "Linear Relationship: Homestead Exemption Rates vs. Home Values",
x = "Median Home Value",
y = "Percentage with Homestead Exemption"
)
The scatterplot shows the relationship between median home values (x-axis) and homestead exemption rates (y-axis) across Philadelphia census tracts. The loess curve suggests:
• Homestead exemption rates increase with home values up to around $250,000.
• Peak participation (around 50%) occurs in the $200,000-$300,000 range.
• Participation declines slightly for higher-value homes.
• The widening gray confidence band at higher home values indicates more uncertainty in the trend, likely due to fewer data points in that range.
• Participation rates vary widely at every home value, as shown by the vertical spread of points.
The linear model finds no significant linear relationship (slope p-value = 0.541, R-squared = 0.001). A straight line cannot capture a rise followed by a decline, so the two fits are not in conflict.
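The rise-then-decline pattern described above is a curved relationship, which a straight-line fit can miss entirely. One way to check for curvature formally is to compare the linear model against a model with a squared term. The sketch below uses synthetic data with a built-in inverted-U shape; `toy_tracts` and its values are hypothetical, and a quadratic term is only assumed to be an adequate approximation of the loess curve.

```r
set.seed(42)

# Synthetic tract-level data with a built-in inverted-U shape (illustration only)
toy_tracts <- data.frame(
  median_home_value = runif(300, min = 50000, max = 600000)
)
toy_tracts$pct_homestead <- 50 -
  ((toy_tracts$median_home_value - 250000) / 50000)^2 +
  rnorm(300, sd = 5)

linear_fit    <- lm(pct_homestead ~ median_home_value, data = toy_tracts)
quadratic_fit <- lm(pct_homestead ~ poly(median_home_value, 2), data = toy_tracts)

# Nested-model F test: does the squared term explain additional variance?
curvature_test <- anova(linear_fit, quadratic_fit)
curvature_p    <- curvature_test$`Pr(>F)`[2]

print(curvature_test)
```

A small `curvature_p` would indicate that the squared term improves the fit. On the real tract data the result could differ, given the wide scatter around the trend.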
# Owner Occupancy vs Homestead Rates Analysis
occupancy_analysis <- census_tracts_enriched %>%
filter(pop_density > 0) %>%
mutate(
owner_occ_rate = (owner_hh / occupied_units) * 100,
pct_homestead_owners = (homestead_count / owner_hh) * 100
) %>%
select(GEOID, owner_occ_rate, pct_homestead_owners)
# Visualize the relationship
ggplot(occupancy_analysis, aes(x = owner_occ_rate, y = pct_homestead_owners)) +
geom_point(alpha = 0.6) +
geom_smooth(method = "loess") +
labs(
title = "Owner Occupancy Rate vs. Homestead Participation",
x = "Owner Occupancy Rate (%)",
y = "Homestead Participation Rate (%)"
) +
theme_minimal()
ggsave("outputs/homestead-exemption-distribution.png", census_hist, width = 10, height = 6)
This scatter plot reveals important patterns in homestead exemption participation across Philadelphia’s neighborhoods. By comparing census tract data on owner occupancy rates (from the 2022 5-year ACS) with homestead exemption enrollment, we can identify areas where participation could be improved.
The data shows that most Philadelphia census tracts have owner occupancy rates between 25% and 75%, and that generally over half of eligible homeowners participate in the program. Even so, there are clear opportunities for improvement. Particularly concerning are:
• Census tracts with participation rates below 50%
• Areas with high owner occupancy but low program participation
• Neighborhoods falling well below the expected participation rate (shown by the blue trend line)
Notably, higher rates of owner occupancy don’t automatically translate to higher program participation. This suggests that other factors beyond home ownership - such as awareness of the program, ease of enrollment, or demographic characteristics - may play more significant roles in determining participation rates. These insights can help guide targeted outreach efforts to increase program enrollment among eligible homeowners who are currently missing out on this tax benefit.
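The idea of tracts falling well below the trend line, as described above, can be turned into a ranked outreach list by computing each tract's shortfall from the fitted trend. The following is a minimal sketch on synthetic data. `toy_occupancy`, the made-up GEOIDs, and the cutoff of 1.5 standard deviations are all assumptions for illustration, not values taken from the report.

```r
library(dplyr)

set.seed(1)

# Synthetic tracts (illustration only); GEOIDs are made up
toy_occupancy <- tibble(
  GEOID          = sprintf("T%03d", 1:100),
  owner_occ_rate = runif(100, min = 20, max = 80)
) %>%
  mutate(pct_homestead_owners = 70 - 0.1 * owner_occ_rate + rnorm(100, sd = 5))

# Plant one clearly under-enrolled tract so the flag has something to find
toy_occupancy$pct_homestead_owners[10] <- 20

# Same smoother family as the plot's trend line
trend_fit <- loess(pct_homestead_owners ~ owner_occ_rate, data = toy_occupancy)

flagged_tracts <- toy_occupancy %>%
  mutate(
    expected_rate = predict(trend_fit),                  # fitted trend value
    shortfall     = pct_homestead_owners - expected_rate # negative = below trend
  ) %>%
  filter(shortfall < -1.5 * sd(shortfall)) %>%           # hypothetical cutoff
  arrange(shortfall)

print(flagged_tracts)
```

Sorting by `shortfall` puts the tracts furthest below the expected rate first, which is one plausible way to prioritize outreach. The cutoff would need to be tuned against outreach capacity.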
# Create the dataset with the calculated rates
occupancy_analysis <- census_tracts_enriched %>%
filter(pop_density > 0) %>%
mutate(
owner_occ_rate = (owner_hh / occupied_units) * 100,
pct_homestead_owners = (homestead_count / owner_hh) * 100
) %>%
select(GEOID, owner_occ_rate, pct_homestead_owners)
# Drop tracts where the rate is NA, NaN, or infinite (e.g. owner_hh == 0)
occupancy_analysis <- occupancy_analysis %>%
filter(is.finite(pct_homestead_owners))
# Fit the linear model
occupancy_model <- lm(pct_homestead_owners ~ owner_occ_rate, data = occupancy_analysis)
# View model summary
summary(occupancy_model)
##
## Call:
## lm(formula = pct_homestead_owners ~ owner_occ_rate, data = occupancy_analysis)
##
## Residuals:
## Min 1Q Median 3Q Max
## -69.127 -11.116 0.669 11.707 126.499
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 77.27015 3.00895 25.680 <0.0000000000000002 ***
## owner_occ_rate -0.11459 0.05411 -2.118 0.0348 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 20.61 on 381 degrees of freedom
## Multiple R-squared: 0.01164, Adjusted R-squared: 0.009042
## F-statistic: 4.485 on 1 and 381 DF, p-value: 0.03483
# Visualize with linear fit
ggplot(occupancy_analysis, aes(x = owner_occ_rate, y = pct_homestead_owners)) +
geom_point(alpha = 0.6) +
geom_smooth(method = "lm", se = TRUE) + # Changed from loess to lm
labs(
title = "Linear Relationship: Owner Occupancy vs. Homestead Participation",
x = "Owner Occupancy Rate (%)",
y = "Homestead Participation Rate (%)"
) +
theme_minimal()
# Data frame of zoning types
zoning_types <- data.frame(
Type = c(
"Single Family Detached",
"Single Family Attached",
"Two-Family Attached",
"Multi-Family",
"Residential Mixed-Use",
"Commercial Mixed-Use",
"Industrial Residential Mixed-Use"
),
Codes = c(
"RSD1, RSD2, RSD3",
"RSA1, RSA2, RSA3, RSA4, RSA5, RSA6",
"RTA1",
"RM1, RM2, RM3, RM4",
"RMX1, RMX2, RMX3",
"CMX1, CMX2, CMX2.5, CMX3, CMX4, CMX5",
"IRMX"
),
Description = c(
"Detached houses on individual lots",
"Attached and semi-detached houses on individual lots",
"Two-family, semi-detached houses on individual lots",
"Moderate to high-density multi-unit residential buildings",
"Residential and mixed-use development, including master plan development",
"Neighborhood to regional-serving mixed-use development",
"Mix of low-impact industrial, artisan industrial, residential, and neighborhood commercial uses"
)
)
# Formatted table
kable(zoning_types,
col.names = c("Residential Type", "Zoning Codes", "Description"),
caption = "Philadelphia Residential Zoning Classifications")
| Residential Type | Zoning Codes | Description |
|---|---|---|
| Single Family Detached | RSD1, RSD2, RSD3 | Detached houses on individual lots |
| Single Family Attached | RSA1, RSA2, RSA3, RSA4, RSA5, RSA6 | Attached and semi-detached houses on individual lots |
| Two-Family Attached | RTA1 | Two-family, semi-detached houses on individual lots |
| Multi-Family | RM1, RM2, RM3, RM4 | Moderate to high-density multi-unit residential buildings |
| Residential Mixed-Use | RMX1, RMX2, RMX3 | Residential and mixed-use development, including master plan development |
| Commercial Mixed-Use | CMX1, CMX2, CMX2.5, CMX3, CMX4, CMX5 | Neighborhood to regional-serving mixed-use development |
| Industrial Residential Mixed-Use | IRMX | Mix of low-impact industrial, artisan industrial, residential, and neighborhood commercial uses |
properties_filtered <- filtered_properties %>%
select(
zoning,
homestead_exemption,
is_residential,
census_tract,
shape
)
# First create a zoning type classification
filtered_properties <- filtered_properties %>%
mutate(
zoning_type = case_when(
zoning %in% c("RSD1", "RSD2", "RSD3") ~ "Single Family Detached",
zoning %in% c("RSA1", "RSA2", "RSA3", "RSA4", "RSA5", "RSA6") ~ "Single Family Attached",
zoning %in% c("RTA1") ~ "Two-Family Attached",
zoning %in% c("RM1", "RM2", "RM3", "RM4") ~ "Multi-Family",
zoning %in% c("RMX1", "RMX2", "RMX3") ~ "Residential Mixed-Use",
zoning %in% c("CMX1", "CMX2", "CMX2.5", "CMX3", "CMX4", "CMX5") ~ "Commercial Mixed-Use",
zoning %in% c("IRMX") ~ "Industrial Residential Mixed-Use",
TRUE ~ "Other"
),
is_residential = ifelse(zoning_type != "Other", 1, 0)
)
zoning_summary <- filtered_properties %>%
filter(is_residential == 1) %>%
group_by(zoning_type) %>%
summarise(
total_properties = n(),
homestead_count = sum(homestead_exemption > 0, na.rm = TRUE),
pct_homestead = (homestead_count / total_properties) * 100
) %>%
arrange(desc(pct_homestead))
# Homestead rates by zoning type
zoning_summary_chart <- ggplot(zoning_summary,
aes(x = reorder(zoning_type, -pct_homestead),
y = pct_homestead)) +
geom_col(fill = "#e42524",
alpha = 0.8) +
geom_text(aes(label = round(pct_homestead,1)),
vjust = -0.5,
size = 3) +
theme_minimal() +
theme(
axis.text.x = element_text(angle = 45, hjust = 1),
panel.grid.major.x = element_blank(),
plot.title = element_text(size = 18, face = "bold"),
plot.subtitle = element_text(size = 14),
axis.title = element_text(size = 14),
axis.text = element_text(size = 14),
legend.text = element_text(size = 14)
) +
labs(
title = "Homestead Exemption Rates by Zoning Category in Philadelphia",
subtitle = "Residential and Mixed-Use Districts Only",
x = "Zoning Category",
y = "Percentage with Homestead Exemption",
caption = "Source: Philadelphia Property Data, 2025"
) +
scale_y_continuous(
limits = c(0, max(zoning_summary$pct_homestead) * 1.1),
labels = function(x) paste0(x, "%")
)
zoning_summary_chart
ggsave("outputs/homestead-exemption-zoning_summary_chart.png", zoning_summary_chart, width = 10, height = 6)